1.  Introduction

In studying biological phenomena, one often observes random variables which
are the result of other randomly occurring unobservable events. This is usually
the case in the observation of genetic traits. The measurable trait in question
has a probability distribution for the population of animals under study. Each
individual member of the population carries a value of the measurable trait,
but that value may or may not be (and often is not) directly observable. It is
not difficult to envision the probability distribution of the trait in the
population as being continuous, while the distribution of the visible
expression of the trait is a discrete count depending on the value of the
measurable trait.

Such a problem came to the authors' attention during discussion with a poultry
scientist who was interested in the probability distribution governing the
frequency with which blood spotted eggs occur. Poultrymen wish to determine from
examination of a small number of eggs laid early in the life of each hen what
the average probability of laying blood spotted eggs is for the flock.

The problem can be conceptualized as follows. The distribution of blood spots
in eggs for a given chicken is taken as binomial. That is, if p represents the
probability that a given chicken lays a blood spotted egg and m eggs are laid,
then X = number of blood spotted eggs is binomially distributed with parameters
m and p, assuming the eggs are laid independently. However, the probability p
(or propensity) for laying blood spotted eggs (the trait in question) differs
from chicken to chicken and can be thought of as having a continuous
distribution on the unit interval. The probability distribution of the blood
spotting trait p in the population is not directly observable. That is, one
might postulate that the binomial parameter p (or trait) has a distribution on
the unit interval and that the values of this probability carried by each bird
in the flock are independently allocated according to this distribution,
denoted $G(p)$. Rarely, if ever, are values of p directly observable. Thus, the
model is written as

(1.1)  $f(x; m) = \binom{m}{x} \int_0^1 p^x (1 - p)^{m-x} \, dG(p),$

$x = 0, \cdots, m$, where m is the size of the sample examined and $G(p)$ is the
distribution of the probability p.
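As a numerical illustration of (1.1), the following Python sketch evaluates f(x; m) under the assumption that G is a beta distribution with parameters r and s (the family used for the simulations in Section 2). In that case the integral has the closed form C(m, x) B(x + r, m - x + s) / B(r, s). The function names and the values m = 10, r = 1, s = 5 are illustrative choices and do not come from the paper.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def mixture_pmf(x, m, r, s):
    """f(x; m) of (1.1), ASSUMING G is a beta(r, s) distribution."""
    return comb(m, x) * exp(log_beta(x + r, m - x + s) - log_beta(r, s))


# Hypothetical parameter values, chosen only for illustration.
m, r, s = 10, 1.0, 5.0
pmf = [mixture_pmf(x, m, r, s) for x in range(m + 1)]

# The mean of X should equal m * mu, where mu = r / (r + s) is the mean of G.
mean_x = sum(x * f for x, f in enumerate(pmf))
```

The probabilities sum to one over x = 0, ..., m, and the mean of X equals m times the mean of G, which is the relation the moment estimator of Section 2 relies on.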

The problem we consider here is that of estimating the mean $\mu_p$ of the
probability distribution $G(p)$, that is, $\mu = \mu_p = \int_0^1 p \, dG(p)$,
based on a random sample $X_1, \cdots, X_n$ from $f(x; m)$, where $X_i$ is the
number of blood spotted eggs among the m eggs sampled from the ith chicken. The
moment estimator of $\mu_p$ and its large sample properties along with
confidence intervals are developed in Section 2.

It is clear that the above type of sampling, when we wish to estimate $\mu_p$,
occurs in many setups similar to that of our chicken example. Also, the allied
problem, when the sample size m is allowed to vary from individual to
individual, is discussed in Sections 3, 4, and 5. In that case, the distribution
of each $X_i$ is given by equation (1.1), where m is now replaced by $m_i$; that
is, $X_i$ is distributed with discrete density $f(x_i; m_i)$ in (1.1),
$x_i = 0, \cdots, m_i$, $i = 1, \cdots, n$.

Theorems 5.1 and 5.2 develop the consistency and asymptotic distribution
theory for the case of differing sample sizes ($m_i$ different). These theorems
concern an estimator for $\mu$ which behaves asymptotically like the minimum
variance unbiased linear (in the $X_i$) estimator of $\mu$ which is studied in
Section 3.

Some  aspects  of  this  general  problem  are  covered  by  Pearson  [2]  who  discusses 
Bayes  theorem  in  the  light  of  experimental  sampling.  However,  no  attack  on  the 
above  problem  is  made  therein. 

2.  Estimation  of  the  mean  of  G(p) 

Let $X_1, \cdots, X_n$ be a random sample from the model

(2.1)  $f(x; m) = \binom{m}{x} \int_0^1 p^x (1 - p)^{m-x} \, dG(p).$

We consider the problem of estimating the mean

(2.2)  $\mu = \int_0^1 p \, dG(p)$

based only on the observations $X_1, \cdots, X_n$. Note that in fact there
exists a bivariate random sample $\{(X_i, P_i), i = 1, \cdots, n\}$, where we
assume that $X_i$ conditional on $P_i = p$ is binomially distributed with m
trials and success probability p and the $P_i$ are independent, each marginally
distributed as $G(p)$. We write $X_i \mid P_i = p \sim b(m, p)$ and
$P_i \sim G(p)$, $i = 1, \cdots, n$.

We employ the method of moments to obtain our estimator. Observe that

(2.3)  $E(X_i) = E\{E[X_i \mid P_i]\} = m E(P_i) = m\mu.$

From (2.3) and the fact that $X_1, \cdots, X_n$ are independent and identically
distributed with distribution (2.1), we have $E\bar{X} = m\mu$. Hence, the
method of moments yields the estimator $\hat{\mu}$ of $\mu$ given by

(2.4)  $\hat{\mu} = \frac{\bar{X}}{m} = \frac{1}{mn} \sum_{i=1}^{n} X_i.$

From the strong law of large numbers, we have immediately that $\hat{\mu}$ is a
strongly consistent estimator of $\mu$. That is,

(2.5)  $P\{\lim_{n \to \infty} \hat{\mu} = \mu\} = 1.$


Next we will obtain the large sample distribution of $\hat{\mu}$ quite directly
from the central limit theorem. First, observe that the variance of $\hat{\mu}$
is given by

(2.6)  $\mathrm{Var}(\hat{\mu}) = \mathrm{Var}\left(\frac{\bar{X}}{m}\right) = \frac{\sigma_1^2}{m^2 n},$

where $\sigma_1^2 = \mathrm{Var}(X_i)$. Therefore, by the central limit theorem
for independent, identically distributed random variables with finite variance,
we have as $n \to \infty$

(2.7)  $\mathcal{L}\left(\frac{m\sqrt{n}\,(\hat{\mu} - \mu)}{\sigma_1}\right) \to N(0, 1),$

where $\mathcal{L}(Z_n) \to N(\mu, \sigma^2)$ means $\{Z_n\}$ converges in
distribution to a random variable Z which is normally distributed with mean
$\mu$ and variance $\sigma^2$.

Besides $\hat{\mu}$ being strongly consistent as in (2.5) and asymptotically
normal as in (2.7), we note that $\hat{\mu}$ in (2.4) is the minimum variance
unbiased linear (in the $X_i$) estimator of $\mu$. This is a direct result of
Theorem 3.1 in Section 3. Furthermore, by defining

(2.8)  $S^2 = (n - 1)^{-1} \sum_{i=1}^{n} (X_i - \bar{X})^2,$

we have that $S^2$ is an unbiased, consistent estimator of $\sigma_1^2$. That is,

(2.9)  $S^2 \xrightarrow{P} \sigma_1^2$


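as $n \to \infty$. Using (2.9) and (2.7), we obtain (see for example, Rao [3],
(x)-(b), p. 102) as $n \to \infty$

(2.10)  $\mathcal{L}\left(\frac{m\sqrt{n}\,(\hat{\mu} - \mu)}{S}\right) \to N(0, 1).$

From (2.10), we can immediately give a 100(1 - $\alpha$) per cent,
$0 < \alpha < 1$, large sample confidence interval for $\mu$, since

(2.11)  $\lim_{n \to \infty} P\left\{\frac{1}{m}\left(\bar{X} - \frac{Z_{\alpha/2} S}{\sqrt{n}}\right) < \mu < \frac{1}{m}\left(\bar{X} + \frac{Z_{\alpha/2} S}{\sqrt{n}}\right)\right\} = 1 - \alpha,$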

where $Z_{\alpha/2}$ is defined by the equation

(2.12)  $(2\pi)^{-1/2} \int_{Z_{\alpha/2}}^{\infty} e^{-t^2/2} \, dt = \alpha/2.$

For example, if we want a 95 per cent confidence interval for the mean
probability of the flock for laying a blood spotted egg, we would choose
$\alpha = 0.05$, yielding an approximate large sample confidence interval

(2.13)  $\left(\frac{1}{m}\left(\bar{X} - \frac{1.96\,S}{\sqrt{n}}\right), \; \frac{1}{m}\left(\bar{X} + \frac{1.96\,S}{\sqrt{n}}\right)\right).$

Some sample intervals, constructed for data randomly generated by a simulation
process involving varying sample sizes, are included here.

The density given by (2.1) was computed for the case of $G(p)$ being a beta
distribution. That is, assume $dG(p) = g(p) \, dp$, where the density $g(p)$ is
given by

(2.14)  $g(p) = \{\beta(r, s)\}^{-1} p^{r-1} (1 - p)^{s-1}, \quad 0 < p < 1, \; r > 0, \; s > 0,$

and $\beta(r, s) = \int_0^1 p^{r-1} (1 - p)^{s-1} \, dp$. A range of values of m,
r, and s was chosen and $\mu$ computed for each set of values thereof. Random
samples were drawn and $\hat{\mu}$ and $\mathrm{Var}(\hat{\mu})$ estimated.
Table I gives $\mu$, $\hat{\mu}$, and the estimated standard error of
$\hat{\mu}$, $S/(m\sqrt{n})$, for a few selected values of m, r, and s. The
entry on the first line is for a sample of n = 50 and on the second line for a
sample of n = 200. In most instances $\hat{\mu}$ estimates $\mu$ well;
$\hat{\mu} \pm 1.96\,S/(m\sqrt{n})$ fails to contain $\mu$ only three times out
of the 40 cases presented, namely, when m = 10, r = s = 1, n = 50; m = 15,
r = 1, s = 5, n = 200; and m = 15, r = 2, s = 15, n = 200. Many other values of
m, r, s, and n were also tried with similarly good results.
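The simulation behind Table I can be sketched along the following lines. This is not the authors' program (which is not reproduced in the paper); it is a minimal Python reconstruction under the stated model, with P_i drawn from a beta(r, s) distribution and X_i drawn as binomial(m, P_i). The seed, the function names, and the single parameter combination shown are assumptions made for illustration, so the printed numbers will not match the table entries.

```python
import random
from math import sqrt


def simulate_flock(n, m, r, s, rng):
    """Draw n counts: P_i ~ beta(r, s), then X_i | P_i ~ binomial(m, P_i)."""
    xs = []
    for _ in range(n):
        p = rng.betavariate(r, s)
        # A binomial(m, p) draw as the number of successes in m Bernoulli trials.
        xs.append(sum(rng.random() < p for _ in range(m)))
    return xs


def estimate_with_se(xs, m):
    """Return mu_hat of (2.4) and its estimated standard error S / (m sqrt(n))."""
    n = len(xs)
    x_bar = sum(xs) / n
    s2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return x_bar / m, sqrt(s2) / (m * sqrt(n))


rng = random.Random(0)            # arbitrary seed, for reproducibility only
m, r, s = 10, 1.0, 5.0            # one illustrative parameter combination
mu = r / (r + s)                  # mean of the beta(r, s) distribution
for n in (50, 200):
    mu_hat, se = estimate_with_se(simulate_flock(n, m, r, s, rng), m)
    print(n, round(mu, 3), round(mu_hat, 3), round(se, 4))
```

Repeating the loop over a grid of m, r, and s values and checking whether mu lies within mu_hat plus or minus 1.96 standard errors gives a coverage check of the kind summarized in the text.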


3.  The  case  of  differing  sample  sizes 

Oftentimes in applications the number of trials m connected with each
observation may not be the same. That is, consider the case where each $X_i$ is
distributed as (2.1) with m replaced by $m_i$, $i = 1, \cdots, n$. We assume the
$m_i$ are all known, fixed, positive integers, but not necessarily equal.

To estimate $\mu$, we again use the method of moments. Similar to (2.3), we have

(3.1)  $E(X_i) = m_i \mu, \quad i = 1, \cdots, n.$

Since (3.1) implies both $E\{\sum_{i=1}^{n} (X_i/m_i)\} = n\mu$ and
$E(\sum_{i=1}^{n} X_i) = (\sum_{i=1}^{n} m_i)\mu$, we have as possible moment
estimators of $\mu$ both

(3.2)  $\hat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} \frac{X_i}{m_i}$

and

(3.3)  $\hat{\mu}_2 = \frac{\bar{X}}{\bar{m}},$

where $\bar{m} = (1/n) \sum_{i=1}^{n} m_i$.
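The two moment estimators differ only in how the hens are weighted: (3.2) averages the individual proportions, while (3.3) pools all eggs. A small Python sketch, with hypothetical counts and sample sizes, makes the difference concrete.

```python
def mu_hat_1(xs, ms):
    """Estimator (3.2): the unweighted average of the proportions X_i / m_i."""
    return sum(x / m for x, m in zip(xs, ms)) / len(xs)


def mu_hat_2(xs, ms):
    """Estimator (3.3): X-bar / m-bar, that is, total successes over total trials."""
    return sum(xs) / sum(ms)


# Hypothetical data: three hens with 2, 5, and 12 eggs examined.
xs = [1, 0, 3]
ms = [2, 5, 12]
estimate_1 = mu_hat_1(xs, ms)   # (1/2 + 0/5 + 3/12) / 3 = 0.25
estimate_2 = mu_hat_2(xs, ms)   # 4 / 19
```

When all the m_i are equal the two functions return the same value, which is the estimator (2.4).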

In order to discuss the relative merits of the estimators $\hat{\mu}_1$ and
$\hat{\mu}_2$, we compute their variances. With

(3.4)  $\sigma^2 = \mathrm{Var}(P_i) = \int_0^1 (p - \mu)^2 \, dG(p)$

and

(3.5)  $\tau = E\{P_i(1 - P_i)\} = \int_0^1 p(1 - p) \, dG(p),$
TABLE I

Comparison of $\mu_p$ and $\hat{\mu}$ for Case of Beta
Distribution, n = 50 and n = 200

   m     r     s     $\mu_p$     n     $\hat{\mu}$     $S/(m\sqrt{n})$

  10     1     1     .500       50     .594     .0411
                               200     .505     .0217
  10     1     2     .333       50     .314     .0366
                               200     .342     .0198
  10     1     5     .167       50     .170     .0241
                               200     .158     .0125
  10     1    10     .091       50     .100     .0204
                               200     .093     .0087
  10     1    15     .063       50     .060     .0121
                               200     .060     .0068

  10     2     1     .667       50     .698     .0366
                               200     .688     .0177
  10     2     2     .500       50     .548     .0388
                               200     .482     .0181
  10     2     5     .286       50     .314     .0287
                               200     .298     .0146
  10     2    10     .167       50     .178     .0225
                               200     .154     .0106
  10     2    15     .118       50     .120     .0232
                               200     .116     .0082

  15     1     1     .500       50     .489     .0470
                               200     .474     .0218
  15     1     2     .333       50     .295     .0397
                               200     .327     .0179
  15     1     5     .167       50     .197     .0268
                               200     .190     .0129
  15     1    10     .091       50     .116     .0209
                               200     .096     .0082
  15     1    15     .063       50     .052     .0107
                               200     .062     .0057

  15     2     1     .667       50     .653     .0341
                               200     .657     .0183
  15     2     2     .500       50     .449     .0386
                               200     .485     .0165
  15     2     5     .286       50     .320     .0306
                               200     .285     .0136
  15     2    10     .167       50     .192     .0225
                               200     .167     .0102
  15     2    15     .118       50     .112     .0175
                               200     .118     .0075

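we have

(3.6)  $\mathrm{Var}(\hat{\mu}_2) = (\bar{m} n)^{-2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = n^{-1}(a_n \sigma^2 + b_n \tau),$

where $a_n = (\bar{m}^2 n)^{-1} \sum_{i=1}^{n} m_i^2$, $b_n = (\bar{m})^{-1}$,
$\bar{m} = n^{-1} \sum_{i=1}^{n} m_i$. Also, we obtain

(3.7)  $\mathrm{Var}(\hat{\mu}_1) = n^{-2} \sum_{i=1}^{n} \mathrm{Var}(X_i/m_i) = n^{-1}(\sigma^2 + c_n \tau),$

where $c_n = n^{-1} \sum_{i=1}^{n} m_i^{-1}$.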

From the Cauchy-Schwarz inequality, it is easy to show that

(3.8)  $a_n \ge 1$ and $b_n \le c_n,$

where the inequalities are strict unless $m_i = m$ for all i. Hence, it is clear
that neither $\hat{\mu}_1$ nor $\hat{\mu}_2$ is for all G relatively more
efficient than the other. In fact, from (3.8) we see that if $\sigma^2 = 0$ and
$\tau > 0$, $\hat{\mu}_2$ is more efficient than $\hat{\mu}_1$, that is,
$\mathrm{Var}(\hat{\mu}_2) \le \mathrm{Var}(\hat{\mu}_1)$, while the reverse is
true if $\sigma^2 > 0$ and $\tau = 0$.
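The comparison based on (3.6), (3.7), and (3.8) is easy to check numerically. The Python sketch below evaluates both variance formulas for a hypothetical set of sample sizes; the values of sigma^2 and tau passed in are arbitrary and are chosen only to exhibit the two extreme cases discussed above.

```python
def variance_mu_hat_1(ms, sigma2, tau):
    """Var(mu_hat_1) from (3.7): (sigma^2 + c_n * tau) / n."""
    n = len(ms)
    c_n = sum(1 / m for m in ms) / n
    return (sigma2 + c_n * tau) / n


def variance_mu_hat_2(ms, sigma2, tau):
    """Var(mu_hat_2) from (3.6): (a_n * sigma^2 + b_n * tau) / n."""
    n = len(ms)
    m_bar = sum(ms) / n
    a_n = sum(m * m for m in ms) / (m_bar ** 2 * n)
    b_n = 1 / m_bar
    return (a_n * sigma2 + b_n * tau) / n


sizes = [2, 5, 12, 20]   # hypothetical, unequal sample sizes

# sigma^2 = 0, tau > 0: the pooled estimator mu_hat_2 has the smaller variance.
pooled_wins = variance_mu_hat_2(sizes, 0.0, 0.2) < variance_mu_hat_1(sizes, 0.0, 0.2)

# sigma^2 > 0, tau = 0: the averaged estimator mu_hat_1 has the smaller variance.
averaged_wins = variance_mu_hat_1(sizes, 0.05, 0.0) < variance_mu_hat_2(sizes, 0.05, 0.0)
```

For intermediate values of sigma^2 and tau either estimator may win, which motivates the search for an optimal linear combination in Theorem 3.1.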

Note the case $\sigma^2 = 0$ implies that $G(p)$ is degenerate at, say, $p_0$,
and hence that $S_n = \sum_{i=1}^{n} X_i$ is binomially distributed with
parameters $\sum_{i=1}^{n} m_i$ and $p_0$. Thus, $\hat{\mu}_2$ in this case
becomes the classical (maximum likelihood, moment, and minimum variance unbiased
estimator) solution to the problem of estimating $\mu = p_0$. In the remainder
of the paper, we omit this case from consideration and shall assume
$\sigma^2 > 0$.

We consider now the question of the existence of an optimal solution in the
minimum variance sense. The following theorem gives a solution to the problem
for unbiased linear (in $X_i$) estimators.

Let $M_n$ be the class of all unbiased linear estimators of $\mu$ based on
$X_1, \cdots, X_n$. That is,

(3.9)  $M_n = \left\{\hat{\mu} \;\middle|\; \hat{\mu} = \sum_{i=1}^{n} c_{in} \frac{X_i}{m_i}, \; \sum_{i=1}^{n} c_{in} = 1\right\}.$

Observe that the condition $\sum_{i=1}^{n} c_{in} = 1$ implies that $\hat{\mu}$
is unbiased by (3.1). Also, $\hat{\mu} \in M_n$ is clearly linear in the $X_i$
as well as in the $X_i/m_i$ by taking $c'_{in} = c_{in}/m_i$ and writing
$\hat{\mu} = \sum_{i=1}^{n} c'_{in} X_i$ in $M_n$.

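Theorem 3.1.  The minimum variance unbiased linear estimate of $\mu$ (that is,
the $\hat{\mu} \in M_n$ of minimum variance) is given by

(3.10)  $\hat{\mu}_0 = \sum_{i=1}^{n} c^0_{in} \frac{X_i}{m_i},$

where

(3.11)  $c^0_{in} = c_{in}(\sigma^2, \tau) = \left(\sigma^2 + \frac{\tau}{m_i}\right)^{-1} \Big/ \sum_{j=1}^{n} \left(\sigma^2 + \frac{\tau}{m_j}\right)^{-1},$

with $\sigma^2$ and $\tau$ as in (3.4) and (3.5).

Remark.  In particular, $\hat{\mu}_0 = \hat{\mu}_1 = \hat{\mu}_2$ if all
$m_i = m$ (see Section 1), and $\hat{\mu}_0 = \hat{\mu}_1$ if $\sigma^2 > 0$,
$\tau = 0$, and $\hat{\mu}_0 = \hat{\mu}_2$ if $\sigma^2 = 0$, $\tau > 0$.

Proof.  Let $\sigma_i^2 = \mathrm{Var}(X_i/m_i) = \sigma^2 + \tau/m_i$. Then,
for $\hat{\mu} \in M_n$, we have
$\mathrm{Var}(\hat{\mu}) = \sum_{i=1}^{n} c_{in}^2 \sigma_i^2$, which is
minimized by taking $c_{in} = c^0_{in}$ as in (3.11). (See for example, Rao [3],
2.2, p. 249.)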
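The weights (3.11) are inverse-variance weights, normalized to sum to one. The following Python sketch computes them together with the estimator (3.10) and its variance (3.14), treating sigma^2 and tau as known; the function names and the numerical inputs are hypothetical.

```python
def optimal_weights(ms, sigma2, tau):
    """Weights c0_in of (3.11): proportional to 1 / (sigma^2 + tau / m_i)."""
    inverse_variances = [1 / (sigma2 + tau / m) for m in ms]
    total = sum(inverse_variances)
    return [w / total for w in inverse_variances]


def mu_hat_0(xs, ms, sigma2, tau):
    """Minimum variance unbiased linear estimate (3.10), sigma^2 and tau known."""
    weights = optimal_weights(ms, sigma2, tau)
    return sum(c * x / m for c, x, m in zip(weights, xs, ms))


def variance_mu_hat_0(ms, sigma2, tau):
    """Var(mu_hat_0) as in (3.14): the reciprocal of the summed inverse variances."""
    return 1 / sum(1 / (sigma2 + tau / m) for m in ms)


sizes = [2, 5, 12, 20]                         # hypothetical sample sizes
weights = optimal_weights(sizes, 0.05, 0.2)    # hens with more eggs get more weight
```

Setting tau = 0 gives equal weights 1/n (the estimator of (3.2)), and setting sigma^2 = 0 gives weights proportional to m_i (the estimator of (3.3)), in agreement with the Remark.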

Theorem 3.2.  If $\sigma^2 > 0$, then
$P\{\lim_{n \to \infty} \hat{\mu}_0 = \mu\} = 1$ for any sequence $\{m_n\}$ of
positive integers.

Proof.  Let $Y_i = \sigma_i^{-2}(X_i/m_i)$, where
$\sigma_i^2 = \mathrm{Var}(X_i/m_i) = \sigma^2 + \tau/m_i$. Let
$b_n = \sum_{i=1}^{n} \sigma_i^{-2}$ and observe that (see Loève [1], 16.3, II,
A, p. 238) $\hat{\mu}_0 - \mu = b_n^{-1} \sum_{i=1}^{n} (Y_i - EY_i) \to 0$ with
probability 1 as $n \to \infty$ provided

(3.12)  $\sum_{n=1}^{\infty} b_n^{-2} \, \mathrm{Var}(Y_n) < \infty.$

But since $b_n \ge n(\sigma^2 + \tau)^{-1}$ and
$\mathrm{Var}(Y_i) = \sigma_i^{-2} \le \sigma^{-2}$, we see that (3.12) holds
since $\sum_{n=1}^{\infty} n^{-2} < \infty$.

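Theorem 3.3.  If $\sigma^2 > 0$, then for any sequence of positive integers
$\{m_n\}$,

(3.13)  $\mathcal{L}\left(\frac{\hat{\mu}_0 - \mu}{(\mathrm{Var}(\hat{\mu}_0))^{1/2}}\right) \to N(0, 1)$

as $n \to \infty$, where

(3.14)  $\mathrm{Var}(\hat{\mu}_0) = \left\{\sum_{i=1}^{n} \sigma_i^{-2}\right\}^{-1} = \left\{\sum_{i=1}^{n} (\sigma^2 + \tau/m_i)^{-1}\right\}^{-1}.$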


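Proof.  Let $Y_{in} = c^0_{in}(X_i/m_i - \mu)$,
$\sigma_{in}^2 = \mathrm{Var}(Y_{in})$, and
$s_n^2 = \sum_{i=1}^{n} \sigma_{in}^2$. Then, by an extended version of the
Liapounov theorem (Loève [1], 20.1, a, p. 277), we have

(3.15)  $\mathcal{L}\left(\frac{\hat{\mu}_0 - \mu}{(\mathrm{Var}(\hat{\mu}_0))^{1/2}}\right) \to N(0, 1)$

as $n \to \infty$ provided

(3.16)  $s_n^{-3} \sum_{i=1}^{n} E|Y_{in}|^3 \to 0$

as $n \to \infty$. But since $s_n^2 = \{\sum_{i=1}^{n} \sigma_i^{-2}\}^{-1}$ and
$c^0_{in} = s_n^2 \sigma_i^{-2}$ with $\sigma_i^2 = \sigma^2 + \tau/m_i$, we have

(3.17)  $s_n^{-3} \sum_{i=1}^{n} E|Y_{in}|^3 = s_n^3 \sum_{i=1}^{n} \sigma_i^{-6} E\left|\frac{X_i}{m_i} - \mu\right|^3 \le s_n^3 \sum_{i=1}^{n} \sigma_i^{-4} \le n^{-1/2} (\sigma^2 + \tau)^{3/2} \sigma^{-4},$

where the last inequality follows by using
$(\sigma^2 + \tau)^{-1} \le \sigma_i^{-2} \le \sigma^{-2}$. Hence, (3.16) holds
and the theorem is proved.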

We note that Theorem 3.3 immediately yields large sample confidence intervals
on $\mu$ provided $\sigma^2$ and $\tau$ are known. Under the condition of
Theorem 3.3 a 100(1 - $\alpha$) per cent large sample approximate confidence
interval for $\mu$ is given by

(3.18)  $(\hat{\mu}_0 - \varepsilon_n, \; \hat{\mu}_0 + \varepsilon_n),$

where
$\varepsilon_n = Z_{\alpha/2} \{\sum_{i=1}^{n} (\sigma^2 + \tau/m_i)^{-1}\}^{-1/2}$
and $\hat{\mu}_0 = \sum_{i=1}^{n} c^0_{in} (X_i/m_i)$. However, in most
applications $\sigma^2$ and $\tau$ remain unknown and we must therefore concern
ourselves with this case. Section 4 discusses the question of estimating
$\sigma^2$ and $\tau$, while Section 5 develops the necessary large sample
results for estimating $\mu$ when $\sigma^2$ and $\tau$ are unknown and the
sample sizes differ.

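4.  Estimation of $\sigma^2$ and $\tau$

Define the random variables $Y_i$, $Z_i$ and the indicator variables
$\delta_i$, $i = 1, \cdots, n$, as follows:

(4.1)  $Y_i = \frac{X_i}{m_i},$

(4.2)  $Z_i = \begin{cases} \dfrac{X_i(m_i - X_i)}{m_i(m_i - 1)} & \text{if } m_i > 1, \\ 0 & \text{if } m_i = 1, \end{cases}$

and

(4.3)  $\delta_i = \begin{cases} 1 & \text{if } m_i > 1, \\ 0 & \text{if } m_i = 1. \end{cases}$

Observing that $EX_i^2 = EX_i(X_i - 1) + EX_i = m_i^2 E(P_i^2) + m_i \tau$, one
obtains

(4.4)  $EZ_i = \delta_i \tau.$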

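Furthermore, using the relationship

(4.5)  $\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (Y_i - \mu)^2 - n(\bar{Y} - \mu)^2,$

it can easily be shown that

(4.6)  $E\left\{\sum_{i=1}^{n} (Y_i - \bar{Y})^2\right\} = (n - 1)\left\{\sigma^2 + \left(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{m_i}\right)\tau\right\}.$

From equations (4.4) and (4.6) and defining

(4.7)  $S_1^2 = (n - 1)^{-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$

and

(4.8)  $a_n = \max\left\{\sum_{i=1}^{n} \delta_i, \; 1\right\},$

we obtain as moment estimators of $\tau$ and $\sigma^2$, when
$\sum_{i=1}^{n} \delta_i > 0$,

(4.9)  $\hat{\tau} = a_n^{-1} \sum_{i=1}^{n} Z_i$

and

(4.10)  $(\sigma^*)^2 = S_1^2 - \hat{\tau}\left(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{m_i}\right).$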

Note that from (4.4) and (4.6) the unbiasedness of $\hat{\tau}$ and
$(\sigma^*)^2$ follows. That is, when $\sum_{i=1}^{n} \delta_i > 0$,

(4.11)  $E(\hat{\tau}) = \tau, \quad E\{(\sigma^*)^2\} = \sigma^2.$

The estimator $(\sigma^*)^2$ in (4.10) may be negative as an estimator of
$\sigma^2 > 0$. We shall find it convenient to modify $(\sigma^*)^2$ for later
purposes and we define $\hat{\sigma}^2$ as the following positive truncation of
$(\sigma^*)^2$,

(4.12)  $\hat{\sigma}^2 = \max\{(\sigma^*)^2, \; n^{-1}\}.$
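The estimators (4.9), (4.10), and (4.12) can be computed in a few lines. The Python sketch below follows the definitions (4.1) through (4.8); the function name and the data are hypothetical, and the truncation level 1/n in the last step is an assumption based on the reading of (4.12) given above.

```python
def variance_component_estimates(xs, ms):
    """Return (tau_hat, sigma_star_sq, sigma_hat_sq) as in (4.9), (4.10), (4.12)."""
    n = len(xs)
    ys = [x / m for x, m in zip(xs, ms)]                       # Y_i of (4.1)
    y_bar = sum(ys) / n
    s1_sq = sum((y - y_bar) ** 2 for y in ys) / (n - 1)        # S_1^2 of (4.7)
    # Z_i of (4.2): zero whenever only one trial was observed.
    zs = [x * (m - x) / (m * (m - 1)) if m > 1 else 0.0 for x, m in zip(xs, ms)]
    a_n = max(sum(1 for m in ms if m > 1), 1)                  # a_n of (4.8)
    tau_hat = sum(zs) / a_n                                    # (4.9)
    mean_inverse_m = sum(1 / m for m in ms) / n
    sigma_star_sq = s1_sq - tau_hat * mean_inverse_m           # (4.10), may be negative
    sigma_hat_sq = max(sigma_star_sq, 1 / n)                   # (4.12), assumed level 1/n
    return tau_hat, sigma_star_sq, sigma_hat_sq


# Hypothetical data: four hens, one of which contributed a single egg.
xs = [1, 0, 3, 1]
ms = [2, 5, 12, 1]
tau_hat, sigma_star_sq, sigma_hat_sq = variance_component_estimates(xs, ms)
```

Observations with m_i = 1 carry no information about tau, which is why they are excluded through the indicator of (4.3) and why the condition on a_n appears in Theorem 4.1.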

The following theorem gives the consistency properties of $\hat{\tau}$,
$\hat{\sigma}^2$ (and $(\sigma^*)^2$).

Theorem 4.1.  Let $\sigma^2 > 0$. If $\{m_n\}$ is any sequence of positive
integers for which $a_n \to \infty$ as $n \to \infty$ (see (4.8)), then as
$n \to \infty$, $\hat{\tau} \to \tau$ and $\hat{\sigma}^2 \to \sigma^2$ (or
$(\sigma^*)^2 \to \sigma^2$) in probability. Furthermore, if $\{m_n\}$ is such
that $\sum_{n=1}^{\infty} a_n^{-2} < \infty$, the convergences hold with
probability one.

Proof.  Observe that since $0 \le Z_i \le 1/2$, we have

(4.13)  $\mathrm{Var}(\hat{\tau}) = a_n^{-2} \sum_{i=1}^{n} \mathrm{Var}(Z_i) \le (4a_n)^{-1}.$

Hence, $\hat{\tau} \to \tau$ in probability as $n \to \infty$ by Chebyshev's
inequality. To prove convergence with probability one for $\hat{\tau}$, it
suffices (by Loève [1], 16.3, II, A, p. 238) to verify that
$\sum_{n=1}^{\infty} a_n^{-2} \, \mathrm{Var}(Z_n) < \infty$, which clearly holds
if $\sum_{n=1}^{\infty} a_n^{-2} < \infty$, since
$\mathrm{Var}(Z_n) \le 1/4$.

Observe that $(\sigma^*)^2$ is linear in $\hat{\tau}$ in (4.10). Thus,
convergence of $(\sigma^*)^2$ to $\sigma^2$ in probability ($\xrightarrow{P}$)
or with probability one ($\xrightarrow{a.s.}$) as $n \to \infty$ follows from
Theorem 3.2 provided

(4.14)  $S_1^2 - \left\{\sigma^2 + \tau\left(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{m_i}\right)\right\} \xrightarrow{a.s.} 0$

as $n \to \infty$. But (4.14) follows immediately from (4.5), (4.6), and (4.7)
provided as $n \to \infty$,

(4.15)  $\frac{1}{n} \sum_{i=1}^{n} \left\{(Y_i - \mu)^2 - \left(\sigma^2 + \frac{\tau}{m_i}\right)\right\} \xrightarrow{a.s.} 0.$

Let $X'_i = (Y_i - \mu)^2$ in (4.15) and write the left side of (4.15) as
$n^{-1} \sum_{i=1}^{n} (X'_i - EX'_i)$. Now, applying a version of the
Kolmogorov strong law of large numbers (see Loève [1], 16.3, II, A, p. 238), we
see (4.15) holds provided

(4.16)  $\sum_{n=1}^{\infty} n^{-2} \, \mathrm{Var}(X'_n) < \infty.$

But the convergence of the series in (4.16) is an immediate consequence of the
boundedness of $X'_n$ by one. Thus, the theorem is proved.

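Remark.  We observe that the condition $a_n \to \infty$ is necessary in Theorem
4.1. To see this consider the case where all the $m_i$ are 1 or 2. Then
$a_n \not\to \infty$ implies there exists $n_0$ such that $\delta_n = 0$
($m_n = 1$) for $n \ge n_0$. Thus
$\hat{\tau} = a_{n_0}^{-1} \sum_{i=1}^{n_0} Z_i$ for all $n \ge n_0$ and clearly
$\hat{\tau}$ does not converge to $\tau$ in probability as $n \to \infty$.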


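5.  Estimation of $\mu$ when $\sigma^2$ and $\tau$ are unknown in the differing
sample size case

The minimum variance unbiased linear estimate of $\mu$ in Theorem 3.1 depends
on knowing $\sigma^2$ and $\tau$ for the optimal choice of the constants
$c^0_{in}$ in (3.11). To overcome this problem when $\sigma^2$ and $\tau$ are
unknown, we propose and study an estimator of $\mu$, denoted $\tilde{\mu}_0$,
which chooses the $c_{in}$ in (3.11) based on the estimators $\hat{\sigma}^2$,
$\hat{\tau}$ of $\sigma^2$, $\tau$ given in the previous section. Theorems 5.1
and 5.2 give the large sample properties of the proposed estimator.

Specifically, let

(5.1)  $\hat{c}_{in} = c_{in}(\hat{\sigma}^2, \hat{\tau}) = \begin{cases} 1/n & \text{if } \hat{\tau} = 0, \\ \left(\hat{\sigma}^2 + \dfrac{\hat{\tau}}{m_i}\right)^{-1} \Big/ \sum_{j=1}^{n} \left(\hat{\sigma}^2 + \dfrac{\hat{\tau}}{m_j}\right)^{-1} & \text{if } \hat{\tau} > 0, \end{cases}$

where $\hat{\tau}$ and $\hat{\sigma}^2$ are defined by (4.9), (4.10), and (4.12).
Now define

(5.2)  $\tilde{\mu}_0 = \sum_{i=1}^{n} \hat{c}_{in} \frac{X_i}{m_i}.$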

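It will be shown that under appropriate conditions $\tilde{\mu}_0 \to \mu$ in
probability as $n \to \infty$ (see Theorem 5.1). Before proving this theorem,
however, we develop the following lemma.

Lemma 5.1.  If $\sigma^2 > 0$ and $a_n \to \infty$ as $n \to \infty$ (see (4.3)
and (4.8)), then

(5.3)  $\tilde{\mu}_0 = \hat{\mu}_0 + (\hat{\sigma}^2 - \sigma^2) \sum_{i=1}^{n} \alpha_{in}\left(\frac{X_i}{m_i} - \mu\right) + (\hat{\tau} - \tau) \sum_{i=1}^{n} \beta_{in}\left(\frac{X_i}{m_i} - \mu\right)$
       $\qquad + \left[|\hat{\sigma}^2 - \sigma^2| + |\hat{\tau} - \tau|\right]^2 O_p(1),$

where $\alpha_{in}$ and $\beta_{in}$ are nonrandom coefficients such that
$\sum_{i=1}^{n} \alpha_{in} = \sum_{i=1}^{n} \beta_{in} = 0$ and $O_p(1)$
indicates a random factor which is bounded in probability.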

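Proof.  Let

(5.4)  $c_{in}(u, v) = \left\{\sum_{j=1}^{n} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^{-1} \left(u + \frac{v}{m_i}\right)^{-1},$
       $\alpha_{in} = \left.\frac{\partial c_{in}}{\partial u}\right|_{u = \sigma^2, \, v = \tau}, \qquad \beta_{in} = \left.\frac{\partial c_{in}}{\partial v}\right|_{u = \sigma^2, \, v = \tau}.$

Observe that $c_{in}(u, v)$ has continuous second order partial (and mixed
partial) derivatives on the set $\{(u, v): 0 < u < \infty, \; 0 \le v < \infty\}$,
where the partial derivative is defined from the right at $v = 0$. Hence, we
have the following second order Taylor expansion for
$c_{in}(\hat{\sigma}^2, \hat{\tau})$,

(5.5)  $c_{in}(\hat{\sigma}^2, \hat{\tau}) = c_{in}(\sigma^2, \tau) + (\hat{\sigma}^2 - \sigma^2)\alpha_{in} + (\hat{\tau} - \tau)\beta_{in}$
       $\qquad + \frac{(\hat{\sigma}^2 - \sigma^2)^2}{2} \left.\frac{\partial^2 c_{in}}{\partial u^2}\right|_{u = u_i^*, \, v = v_i^*} + (\hat{\sigma}^2 - \sigma^2)(\hat{\tau} - \tau) \left.\frac{\partial^2 c_{in}}{\partial u \, \partial v}\right|_{u = u_i^*, \, v = v_i^*}$
       $\qquad + \frac{(\hat{\tau} - \tau)^2}{2} \left.\frac{\partial^2 c_{in}}{\partial v^2}\right|_{u = u_i^*, \, v = v_i^*},$

where $\min(\sigma^2, \hat{\sigma}^2) \le u_i^* \le \max(\sigma^2, \hat{\sigma}^2)$
and $\min(\tau, \hat{\tau}) \le v_i^* \le \max(\tau, \hat{\tau})$. Observe that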

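(5.6)  $\frac{\partial c_{in}}{\partial u} = \frac{\left(u + \frac{v}{m_i}\right)^{-1} \sum_{j=1}^{n} \left(u + \frac{v}{m_j}\right)^{-2} - \left(u + \frac{v}{m_i}\right)^{-2} \sum_{j=1}^{n} \left(u + \frac{v}{m_j}\right)^{-1}}{\left\{\sum_{j=1}^{n} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2},$

(5.7)  $\frac{\partial c_{in}}{\partial v} = \frac{\left(u + \frac{v}{m_i}\right)^{-1} \sum_{j=1}^{n} \frac{1}{m_j}\left(u + \frac{v}{m_j}\right)^{-2} - \frac{1}{m_i}\left(u + \frac{v}{m_i}\right)^{-2} \sum_{j=1}^{n} \left(u + \frac{v}{m_j}\right)^{-1}}{\left\{\sum_{j=1}^{n} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2},$

(5.8)  $\frac{\partial^2 c_{in}}{\partial u^2} = \frac{2\left(u + \frac{v}{m_i}\right)^{-3}}{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}} - \frac{2\left(u + \frac{v}{m_i}\right)^{-2} \sum_{j} \left(u + \frac{v}{m_j}\right)^{-2}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2}$
       $\qquad - \frac{2\left(u + \frac{v}{m_i}\right)^{-1} \sum_{j} \left(u + \frac{v}{m_j}\right)^{-3}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2} + \frac{2\left(u + \frac{v}{m_i}\right)^{-1} \left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-2}\right\}^2}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^3},$

(5.9)  $\frac{\partial^2 c_{in}}{\partial v^2} = \frac{\frac{2}{m_i^2}\left(u + \frac{v}{m_i}\right)^{-3}}{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}} - \frac{\frac{2}{m_i}\left(u + \frac{v}{m_i}\right)^{-2} \sum_{j} \frac{1}{m_j}\left(u + \frac{v}{m_j}\right)^{-2}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2}$
       $\qquad - \frac{2\left(u + \frac{v}{m_i}\right)^{-1} \sum_{j} \frac{1}{m_j^2}\left(u + \frac{v}{m_j}\right)^{-3}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2} + \frac{2\left(u + \frac{v}{m_i}\right)^{-1} \left\{\sum_{j} \frac{1}{m_j}\left(u + \frac{v}{m_j}\right)^{-2}\right\}^2}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^3},$

(5.10)  $\frac{\partial^2 c_{in}}{\partial u \, \partial v} = \frac{\frac{2}{m_i}\left(u + \frac{v}{m_i}\right)^{-3}}{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}} - \frac{\left(u + \frac{v}{m_i}\right)^{-2} \sum_{j} \frac{1}{m_j}\left(u + \frac{v}{m_j}\right)^{-2}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2} - \frac{\frac{1}{m_i}\left(u + \frac{v}{m_i}\right)^{-2} \sum_{j} \left(u + \frac{v}{m_j}\right)^{-2}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2}$
        $\qquad - \frac{2\left(u + \frac{v}{m_i}\right)^{-1} \sum_{j} \frac{1}{m_j}\left(u + \frac{v}{m_j}\right)^{-3}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^2} + \frac{2\left(u + \frac{v}{m_i}\right)^{-1} \left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-2}\right\} \left\{\sum_{j} \frac{1}{m_j}\left(u + \frac{v}{m_j}\right)^{-2}\right\}}{\left\{\sum_{j} \left(u + \frac{v}{m_j}\right)^{-1}\right\}^3}.$

260  SIXTH  BERKELEY  SYMPOSIUM:  SOUTHWARD  AND  VAN  RYZIN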

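Note that from (5.6) and (5.7), we see that
$\sum_{i=1}^{n} \alpha_{in} = \sum_{i=1}^{n} \beta_{in} = 0$. Thus, we have from
(5.2) and the Taylor expansion (5.5),

(5.11)  $\tilde{\mu}_0 = \hat{\mu}_0 + (\hat{\sigma}^2 - \sigma^2) \sum_{i=1}^{n} \alpha_{in}\left(\frac{X_i}{m_i} - \mu\right) + (\hat{\tau} - \tau) \sum_{i=1}^{n} \beta_{in}\left(\frac{X_i}{m_i} - \mu\right)$
        $\qquad + \frac{(\hat{\sigma}^2 - \sigma^2)^2}{2} \sum_{i=1}^{n} \frac{X_i}{m_i} \left(\left.\frac{\partial^2 c_{in}}{\partial u^2}\right|_{u = u_i^*, \, v = v_i^*}\right)$
        $\qquad + (\hat{\sigma}^2 - \sigma^2)(\hat{\tau} - \tau) \sum_{i=1}^{n} \frac{X_i}{m_i} \left(\left.\frac{\partial^2 c_{in}}{\partial u \, \partial v}\right|_{u = u_i^*, \, v = v_i^*}\right)$
        $\qquad + \frac{(\hat{\tau} - \tau)^2}{2} \sum_{i=1}^{n} \frac{X_i}{m_i} \left(\left.\frac{\partial^2 c_{in}}{\partial v^2}\right|_{u = u_i^*, \, v = v_i^*}\right).$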


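But the remainder (the last three terms) on the right side of (5.11) is bounded
by

(5.12)  $\left\{|\hat{\sigma}^2 - \sigma^2| + |\hat{\tau} - \tau|\right\}^2 \left\{5(\underline{u})^{-3}(\bar{u} + \bar{v})\right\},$

where $\bar{u} = \max_i u_i^*$, $\underline{u} = \min_i u_i^*$, and
$\bar{v} = \max_i v_i^*$, since from (5.8) through (5.10) and repeated use of
the inequalities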

(5.13)  (uf  +  vi 
it  is  easy  to  show  that 

(5.14) 


and 

(5.15) 


Bounding  the  right  side  of  (5.11)  by  (5.12)  and  noting  that  Theorem  4.1 
implies  that  as  n  — >  « , 

( U )“3  P 


7S| 

*  .  Vi  \ 
<  m,/ 

l/^Cin 

\ 

|\3w2 

|/32Ct>i 

*  w 

U  —  Ui  tV=Vi/ 

) 

IV  di>2 

*  */ 

U  =  Ut  ,V  —Vi  / 

/  d2cjn 

\ 

\du  dv 

*  *  / 
W  =  Ui  ,v=t>i  / 

(5.16) 


i=i>0. 


\u  +  u)-1  (cr2  +  r)~ 
we have that $5(\underline{u})^{-3}(\bar{u} + \bar{v})$ is $O_p(1)$. Hence, the
lemma is proved.
Theorem 5.1.  If $\sigma^2 > 0$ and $a_n \to \infty$ as $n \to \infty$ (see
(4.3) and (4.8)), then $\tilde{\mu}_0$ defined by (5.1) and (5.2) is such that
$\tilde{\mu}_0 \xrightarrow{P} \mu$ as $n \to \infty$.

Proof.  Repeated use of
$(\sigma^2 + \tau)^{-1} \le (\sigma^2 + \tau/m_i)^{-1} \le \sigma^{-2}$ and
$m_i^{-1} \le 1$ in (5.6) and (5.7) imply $\sum_{i=1}^{n} |\alpha_{in}|$ and
$\sum_{i=1}^{n} |\beta_{in}|$ are bounded by
$\sigma^{-4}(\sigma^2 + \tau)^{-1}$. Hence, bounding
$|(X_i/m_i) - \mu| \le 1$ in (5.3) and invoking the convergences in Theorems 3.2
and 4.1, expansion (5.3) yields the result.

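Lemma 5.2.  If $n^{1/2} a_n^{-1} \to 0$ as $n \to \infty$, then
$n^{1/4}(\hat{\sigma}^2 - \sigma^2) \xrightarrow{P} 0$ and
$n^{1/4}(\hat{\tau} - \tau) \xrightarrow{P} 0$ as $n \to \infty$.

Proof.  From (4.2) we have $0 \le Z_i \le 1/2$ and
$\mathrm{Var}(Z_i) \le 1/4$ and therefore,

(5.17)  $\mathrm{Var}(n^{1/4} \hat{\tau}) = n^{1/2} a_n^{-2} \sum_{i=1}^{n} \delta_i \, \mathrm{Var}(Z_i) \le \frac{1}{4} n^{1/2} a_n^{-1}.$

Thus, by Chebyshev's inequality, we have

(5.18)  $n^{1/4}(\hat{\tau} - \tau) \xrightarrow{P} 0$

as $n \to \infty$.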

Next,  in  (4.5)  and  (4.7)  observe  that 

(5.19)  (n  -  l)Si  =  £  (Yi-  /i)2  -  n(P  -  M)2  ^  2n(F  -  p)2. 

»=i 

Since  &2  is  a  nonnegative  random  variable  and  |F  —  /z|  ^  1,  it  follows  that 

(5.20)  Var  (S?)  £  ESi  £  (rffS*  Var  (7) 

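But this inequality together with $m_i^{-1} \le 1$ imply that
$\mathrm{Var}(n^{1/4} S_1^2)$ is $O(n^{-1/2}(\sigma^2 + \tau))$. Hence, again by
Chebyshev's inequality and (4.6), we have

(5.21)  $n^{1/4}\left\{S_1^2 - \left[\sigma^2 + \tau\left(\frac{1}{n} \sum_{i=1}^{n} \frac{1}{m_i}\right)\right]\right\} \xrightarrow{P} 0$

as $n \to \infty$. This result together with (5.18) combine in (4.10) to yield
$n^{1/4}[(\sigma^*)^2 - \sigma^2] \xrightarrow{P} 0$ as $n \to \infty$, which
completes the proof of the lemma by using the definition of $\hat{\sigma}^2$.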

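Theorem 5.2.  If $a_n \to \infty$ and $n^{1/2} a_n^{-1} \to 0$ as
$n \to \infty$ (see (4.3) and (4.8)) and $\sigma^2 > 0$, then

(5.22)  $\mathcal{L}\left(\frac{\tilde{\mu}_0 - \mu}{(\mathrm{Var}(\hat{\mu}_0))^{1/2}}\right) \to N(0, 1)$

as $n \to \infty$. Furthermore, replacing

        $\mathrm{Var}(\hat{\mu}_0) = 1 \Big/ \left[\sum_{i=1}^{n} \left(\sigma^2 + \frac{\tau}{m_i}\right)^{-1}\right]$

by

(5.23)  $U^2 = 1 \Big/ \left[\sum_{i=1}^{n} \left(\hat{\sigma}^2 + \frac{\hat{\tau}}{m_i}\right)^{-1}\right],$

the result (5.22) still holds.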
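Proof.  Using the Taylor expansion (5.3) of Lemma 5.1, write

(5.24)  $\frac{\tilde{\mu}_0 - \mu}{(\mathrm{Var}(\hat{\mu}_0))^{1/2}} = \frac{\hat{\mu}_0 - \mu}{(\mathrm{Var}(\hat{\mu}_0))^{1/2}} + \frac{n^{1/4}(\hat{\sigma}^2 - \sigma^2)\left[n^{1/4} \sum_{i=1}^{n} \alpha_{in}\left(\frac{X_i}{m_i} - \mu\right)\right]}{(n \, \mathrm{Var}(\hat{\mu}_0))^{1/2}}$
        $\qquad + \frac{n^{1/4}(\hat{\tau} - \tau)\left[n^{1/4} \sum_{i=1}^{n} \beta_{in}\left(\frac{X_i}{m_i} - \mu\right)\right]}{(n \, \mathrm{Var}(\hat{\mu}_0))^{1/2}} + \frac{\left\{n^{1/4}|\hat{\sigma}^2 - \sigma^2| + n^{1/4}|\hat{\tau} - \tau|\right\}^2 O_p(1)}{(n \, \mathrm{Var}(\hat{\mu}_0))^{1/2}}.$

From (5.24) and Theorem 3.3, the theorem will be completed provided the last
three terms on the right side of (5.24) converge to zero in probability as
$n \to \infty$. However, such is the case from Lemma 5.2 provided

(5.25)  $\liminf_{n \to \infty} \{n \, \mathrm{Var}(\hat{\mu}_0)\} \ge c > 0$

and that as $n \to \infty$,

(5.26)  $n^{1/4} \sum_{i=1}^{n} \alpha_{in}\left(\frac{X_i}{m_i} - \mu\right) \xrightarrow{P} 0$

and

(5.27)  $n^{1/4} \sum_{i=1}^{n} \beta_{in}\left(\frac{X_i}{m_i} - \mu\right) \xrightarrow{P} 0.$

The result (5.25) follows by noting that

(5.28)  $n \, \mathrm{Var}(\hat{\mu}_0) = n\left\{\sum_{i=1}^{n} \left(\sigma^2 + \frac{\tau}{m_i}\right)^{-1}\right\}^{-1} \ge \sigma^2 > 0$ for all n.

The result (5.26) follows immediately from Chebyshev's inequality upon
observing from (5.6) we have
$|\alpha_{in}| \le 2\sigma^{-4} / \{n(\sigma^2 + \tau)^{-1}\}$, which implies

(5.29)  $\mathrm{Var}\left\{\alpha_{in}\left(\frac{X_i}{m_i} - \mu\right)\right\} = \alpha_{in}^2 \left(\sigma^2 + \frac{\tau}{m_i}\right) \le \frac{4\sigma^{-6}}{n^2 (\sigma^2 + \tau)^{-2}}.$

A similar argument implies (5.27) holds. This completes the proof of the theorem.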

Remark.  A 100(1 - $\alpha$) per cent, $0 < \alpha < 1$, large sample
confidence interval for $\mu$ when $\sigma^2$ and $\tau$ are unknown which is
close to being optimal in the sense that asymptotically it is the same as that
based on $\hat{\mu}_0$ in (3.10) (the minimum variance unbiased linear estimator
of $\mu$) is given from (5.22) and (5.23) by

(5.30)  $\lim_{n \to \infty} P\{\tilde{\mu}_0 - Z_{\alpha/2} U < \mu < \tilde{\mu}_0 + Z_{\alpha/2} U\} = 1 - \alpha,$

where $\tilde{\mu}_0$, U, and $Z_{\alpha/2}$ are defined in (5.2), (5.23), and
(2.12), respectively.
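Putting Sections 4 and 5 together, the plug-in estimator (5.2) and the interval (5.30) can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' code: the function name and the data are hypothetical, the truncation level 1/n follows the reading of (4.12) used above, and U is taken as the square root of the quantity in (5.23).

```python
from math import sqrt


def plug_in_interval(xs, ms, z=1.96):
    """Plug-in estimate of mu as in (5.2) and the large sample interval (5.30)."""
    n = len(xs)
    ys = [x / m for x, m in zip(xs, ms)]
    y_bar = sum(ys) / n
    s1_sq = sum((y - y_bar) ** 2 for y in ys) / (n - 1)          # (4.7)

    # Moment estimators of tau and sigma^2 from Section 4.
    zs = [x * (m - x) / (m * (m - 1)) if m > 1 else 0.0 for x, m in zip(xs, ms)]
    a_n = max(sum(1 for m in ms if m > 1), 1)
    tau_hat = sum(zs) / a_n                                       # (4.9)
    sigma_star_sq = s1_sq - tau_hat * sum(1 / m for m in ms) / n  # (4.10)
    sigma_hat_sq = max(sigma_star_sq, 1 / n)                      # (4.12), assumed level

    # Estimated inverse-variance weights; they reduce to 1/n when tau_hat = 0.
    inverse_variances = [1 / (sigma_hat_sq + tau_hat / m) for m in ms]
    total = sum(inverse_variances)
    mu_tilde = sum(w * y for w, y in zip(inverse_variances, ys)) / total  # (5.2)
    u = 1 / sqrt(total)                                           # U from (5.23)
    return mu_tilde, mu_tilde - z * u, mu_tilde + z * u


# Hypothetical data: six hens with differing numbers of eggs examined.
xs = [0, 1, 0, 2, 1, 3]
ms = [4, 6, 3, 10, 8, 12]
mu_tilde, lower, upper = plug_in_interval(xs, ms)
```

Because the weights are estimated from the same data, the interval is justified only asymptotically, under the growth condition on a_n stated in Theorem 5.2.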
6.  Summary

This paper has examined the question of estimating the mean $\mu$ in (2.2) of a
random binomial parameter having distribution $G(p)$. Such a problem arose in
the context of measuring an unobservable genetic trait in flocks of chickens.
The sampling scheme upon which our procedures are based involves observations
$X_i$ from a density $f(x_i; m_i)$ given by (1.1), $i = 1, \cdots, n$, where the
$m_i$ are known, fixed, positive integers. The case in which $m_i = m$ for
$i = 1, \cdots, n$ is treated in Section 2 with the confidence intervals for
$\mu$ being given by (2.11) based on the estimator $\hat{\mu}$ in (2.4).

The case in which the $m_i$ differ is developed in Sections 3, 4, and 5 with the
corresponding confidence interval for $\mu$ given by (3.18) based on
$\hat{\mu}_0$ in (3.10) if $\sigma^2$ and $\tau$ are known and by (5.30) based
on $\tilde{\mu}_0$ in (5.2) if $\sigma^2$ and $\tau$ are unknown.


The authors would like to thank Mr. Min-Chiang Wang for doing the computer
programming of the simulation results in Section 2.


REFERENCES 

[1]  M.  Loève,  Probability  Theory,  Princeton,  Van  Nostrand,  1960  (2nd  ed.).

[2]  E.  S.  Pearson,  “Bayes  theorem  in  the  light  of  experimental  sampling,”  Biometrika,  Vol. 
17  (1925),  pp.  388-442. 

[3]  C.  R.  Rao,  Linear  Statistical  Inference  and  Its  Applications,  New  York,  Wiley,  1965. 


